Highly multiplexed imaging similarly to FISH-based spatial transcriptomics allows the detection of dozens of biomolecules in single cells across tissue sections. Upon image processing and segmentation, the protein/RNA expression as well as the location and morphological features of individual cells are extracted for downstream analysis. We developed the steinbock framework to support image pre-processing, segmentation, feature extraction, and data export in a reproducible fashion. Single-cell, spatial data analysis and visualization are facilitated by the R/Bioconductor packages imcRtools and cytomapper. The imcRtools package allows the construction of spatial object graphs in which nodes represent cells and edges indicate cells in close physical proximity. The package further supports the visualization of these graphs together with the location of cells across multiple images. Spatial cellular neighborhoods can be detected by aggregating the phenotypes or expression across neighboring cells and using this information for cell clustering. A supervised spatial clustering approach is provided by detecting fully connected cells of the same phenotype prior to polygon expansion. The imcRtools package allows testing for enriched cell type/cell type interactions or avoidance by permuting cell labels per image. Finally, the cytomapper package allows the visualization of detected cell types or spatial communities directly on segmentation mask together with protein/marker expression. Together, the tools described here specifically support the spatial analysis and visualization of single-cell data. By using standardized data classes, the packages integrate into single-cell analysis workflows within the Bioconductor framework.
To follow this tutorial, please visit
https://github.com/BodenmillerGroup/demos/tree/main/docs.
The compiled .html file of this workshop is hosted at:
https://bodenmillergroup.github.io/demos.
We will need to install the following packages for the workshop:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install(c("imcRtools", "tidyverse", "patchwork",
"ggplot2", "viridis", "pheatmap", "scales")))
To reproduce the analysis, clone the repository:
git clone https://github.com/BodenmillerGroup/demos.git
and open the EuroBioc2022_workshop.Rmd file in the docs folder.
Highly multiplexed imaging enables the simultaneous detection of tens of biological molecules (e.g. proteins, RNA; also referred to as “markers”) in their spatial tissue context. Recently established multiplexed imaging technologies rely on cyclic staining with immunofluorescently-tagged antibodies (Lin et al. 2018; Gut, Herrmann, and Pelkmans 2018), or the use of oligonucleotide-tagged (Saka et al. 2019; Goltsev et al. 2018) or metal-tagged antibodies (Giesen et al. 2014; Angelo et al. 2014), among others. Across technologies, the acquired data are commonly stored as multi-channel images, where each pixel encodes the abundance of all acquired markers at a specific position in the tissue. After data acquisition, bioimage processing and segmentation are conducted to extract data for downstream analysis. When performing end-to-end multiplexed image analysis, the user is often faced with a diverse set of computational tools and complex analysis scripts. We developed an interoperabale, modularized computational workflow to process and analyze multiplexed imaging data (Figure 1). The steinbock framework facilitates multi-channel image processing including raw data pre-processing, image segmentation and feature extraction. Data generated by steinbock can be directly read by the imcRtools R/Bioconductor package for data visualization and spatial analysis (Figure 1). The cytomapper package support image handling and composite as well as segmentation mask visualization.
Figure 1: Overview of the multiplexed image processing and analysis workflow. Raw image data can be interactively visualized using napari plugins such as napari-imc for IMC, to assess data quality and for exploratory visualization. The steinbock framework performs image pre-processing, cell segmentation and single-cell data extraction using established approaches and standardized file formats. Data can be imported into R using the imcRtools package, which further supports spatial visualization and analysis. Storing the data in a SingleCellExperiment or SpatialExperiment object, imcRtools integrates with a variety of data analysis tools of the Bioconductor project such as cytomapper (Eling et al. 2020). Alternatively, steinbock exports data to the anndata format for analysis in Python, e.g. using squidpy.
The workshop focuses on the spatial analysis of single-cell data obtained from multiplexed imaging technologies. While the concepts presented here can be applied in a technology-agnostic way, for demonstration purposes, we present data from Imaging Mass Cytometry (IMC), which relies on tissue staining with metal-labelled antibodies to jointly measure the spatial distribution of up to 40 proteins or RNA at 1μm resolution (Giesen et al. 2014; Schulz et al. 2018).
The workshop highlights the following analysis steps using the imcRtools R/Bioconductor package:
The analysis approaches presented here were taken from the IMC data analysis book. The book provides more detailed information on the technical underpinnings of the analysis.
More information can also be found in our preprint
To highlight spatial data analysis approaches, we provide example data that were acquired as part of the Integrated iMMUnoprofiling of large adaptive CANcer patient cohorts projects (immucan.eu). The spatially annotated, single-cell data can be obtained via:
download.file("https://zenodo.org/record/6810879/files/spe.rds",
"../data/spe.rds")
(spe <- readRDS("../data/spe.rds"))
## class: SpatialExperiment
## dim: 40 46825
## metadata(4): color_vectors cluster_codes SOM_codes delta_area
## assays(2): counts exprs
## rownames(40): MPO HistoneH3 ... DNA1 DNA2
## rowData names(15): channel name ... marker_class used_for_clustering
## colnames(46825): Patient1_001_1 Patient1_001_2 ... Patient4_008_2772
## Patient4_008_2773
## colData names(20): sample_id ObjectNumber ... cell_labels celltype
## reducedDimNames(8): UMAP TSNE ... seurat UMAP_seurat
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : Pos_X Pos_Y
## imgData names(1): sample_id
This SpatialExperiment object contains single-cell data from 4 patients and 14
IMC images. Each column entry represents a single-cell and each row entry
represents a single marker. We quantify protein abundance as the mean pixel
intensity per marker and cell. These abundance measures are stored in the
counts assay and their asinh-transformed values are stored in the exprs
assay.
unique(spe$patient_id)
## [1] "Patient1" "Patient2" "Patient3" "Patient4"
unique(spe$sample_id)
## [1] "Patient1_001" "Patient1_002" "Patient1_003" "Patient2_001" "Patient2_002"
## [6] "Patient2_003" "Patient2_004" "Patient3_001" "Patient3_002" "Patient3_003"
## [11] "Patient4_005" "Patient4_006" "Patient4_007" "Patient4_008"
unique(spe$indication)
## [1] "SCCHN" "BCC" "NSCLC" "CRC"
unique(spe$celltype)
## [1] "Tumor" "Stroma" "Myeloid" "Plasma_cell" "Treg"
## [6] "CD8" "CD4" "undefined" "BnTcell" "Bcell"
## [11] "Neutrophil"
counts(spe)[1:5,1:5]
## Patient1_001_1 Patient1_001_2 Patient1_001_3 Patient1_001_4
## MPO 0.6273888 0.4500000 0.5286462 1.0191420
## HistoneH3 3.4095933 13.0472305 2.5274191 9.5967590
## SMA 0.0000000 0.8347940 0.0000000 0.5937160
## CD16 2.1288451 2.7879388 2.1463270 18.4293193
## CD38 0.1471510 0.5566445 0.9559316 0.5434799
## Patient1_001_5
## MPO 0.4000000
## HistoneH3 2.8898319
## SMA 0.0000000
## CD16 0.8185134
## CD38 0.3812283
assay(spe, "exprs")[1:5,1:5]
## Patient1_001_1 Patient1_001_2 Patient1_001_3 Patient1_001_4
## MPO 0.5921683 0.4360497 0.5066859 0.8948444
## HistoneH3 1.9405827 3.2631884 1.6573665 2.9572761
## SMA 0.0000000 0.7596078 0.0000000 0.5634290
## CD16 1.4998154 1.7491645 1.5072232 3.6078253
## CD38 0.1466251 0.5312943 0.8498667 0.5197596
## Patient1_001_5
## MPO 0.3900353
## HistoneH3 1.7830203
## SMA 0.0000000
## CD16 0.7470596
## CD38 0.3725504
The spatial location of each cell is stored in the spatialCoords slot.
head(spatialCoords(spe))
## Pos_X Pos_Y
## 1 468.8182 0.3636364
## 2 516.4500 0.4000000
## 3 587.5000 0.5000000
## 4 192.2632 0.8947368
## 5 231.5000 0.4000000
## 6 270.8000 0.8500000
During image processing, the steinbock framework extracts spatial object graphs
indicating cells in close physical proximity in form of an edgelist. This interaction
graph is stored in:
colPair(spe, type = "neighborhood")
## SelfHits object with 246584 hits and 0 metadata columns:
## from to
## <integer> <integer>
## [1] 1 27
## [2] 1 54
## [3] 2 10
## [4] 2 43
## [5] 3 16
## ... ... ...
## [246580] 46824 46802
## [246581] 46825 46762
## [246582] 46825 46787
## [246583] 46825 46796
## [246584] 46825 46820
## -------
## nnode: 46825
We can also visualize all cells.
library(imcRtools)
library(ggplot2)
plotSpatial(spe,
node_color_by = "celltype",
img_id = "sample_id",
node_size_fix = 0.5) +
scale_color_manual(values = metadata(spe)$color_vectors$celltype)
Many spatial analysis approaches either compare the observed versus expected number of cells around a given cell type (point process) or utilize interaction graphs (spatial object graphs) to estimate clustering or interaction frequencies between cell types.
The steinbock framework allows the construction of these spatial graphs. During image processing, we have constructed a spatial graph by expanding the individual cell masks by 4 pixels.
The imcRtools package further allows the ad hoc construction of spatial
graphs directly using a SpatialExperiment or SingleCellExperiment object
while considering the spatial location (centroids) of individual cells. The
buildSpatialGraph
function allows constructing spatial graphs by detecting the k-nearest neighbors
in 2D (knn), by detecting all cells within a given distance to the center cell
(expansion) and by Delaunay triangulation (delaunay).
When constructing a knn graph, the number of neighbors (k) needs to be set and
(optionally) the maximum distance to consider (max_dist) can be specified.
When constructing a graph via expansion, the distance to expand (threshold)
needs to be provided. For graphs constructed via Delaunay triangulation,
the max_dist parameter can be set to avoid unusually large connections at the
edge of the image.
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "knn", k = 20)
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "expansion", threshold = 20)
spe <- buildSpatialGraph(spe, img_id = "sample_id", type = "delaunay", max_dist = 50)
The spatial graphs are stored in colPair(spe, name) slots. These slots store
SelfHits objects representing edge lists in which the first column indicates
the index of the “from” cell and the second column the index of the “to” cell.
Each edge list is newly constructed when subsetting the object.
colPairNames(spe)
## [1] "neighborhood" "knn_interaction_graph"
## [3] "expansion_interaction_graph" "delaunay_interaction_graph"
Here, colPair(spe, "neighborhood") stores the spatial graph constructed by
steinbock, colPair(spe, "knn_interaction_graph") stores the knn spatial
graph, colPair(spe, "expansion_interaction_graph") stores the expansion graph
and colPair(spe, "delaunay_interaction_graph") stores the graph constructed by
Delaunay triangulation.
Here, we introduce the plotSpatial function of the imcRtools package to visualize the cells’ centroids and cell-cell interactions as spatial graphs.
In the following example, we select one image for visualization purposes.
Here, each dot (node) represents a cell and edges are drawn between cells
in close physical proximity as detected by steinbock or the buildSpatialGraph
function. Nodes are variably colored based on the cell type and edges are
colored in grey.
library(viridis)
# steinbock interaction graph
plotSpatial(spe[,spe$sample_id == "Patient3_001"],
node_color_by = "celltype",
img_id = "sample_id",
draw_edges = TRUE,
colPairName = "neighborhood",
nodes_first = FALSE,
edge_color_fix = "grey") +
scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
ggtitle("steinbock interaction graph")
# knn interaction graph
plotSpatial(spe[,spe$sample_id == "Patient3_001"],
node_color_by = "celltype",
img_id = "sample_id",
draw_edges = TRUE,
colPairName = "knn_interaction_graph",
nodes_first = FALSE,
edge_color_fix = "grey") +
scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
ggtitle("knn interaction graph")
# expansion interaction graph
plotSpatial(spe[,spe$sample_id == "Patient3_001"],
node_color_by = "celltype",
img_id = "sample_id",
draw_edges = TRUE,
colPairName = "expansion_interaction_graph",
nodes_first = FALSE,
directed = FALSE,
edge_color_fix = "grey") +
scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
ggtitle("expansion interaction graph")
# delaunay interaction graph
plotSpatial(spe[,spe$sample_id == "Patient3_001"],
node_color_by = "celltype",
img_id = "sample_id",
draw_edges = TRUE,
colPairName = "delaunay_interaction_graph",
nodes_first = FALSE,
edge_color_fix = "grey") +
scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
ggtitle("delaunay interaction graph")
The detection of spatial communities was proposed by (Jackson et al. 2020). Here, cells are clustered solely based on their interactions as defined by the spatial object graph. In the following example, we perform spatial community detection separately for tumor and stromal cells.
The general procedure is as follows:
1. subset the object to contain either tumor or stromal cells
2. create an igraph object from the edge list stored in
colPair(tumor_spe, "neighborhood")
3. perform community detection using the Louvain algorithm
4. store the community IDs in a vector and replace all communities with a size
smaller than 10 by NA
Both tumor and stromal spatial communities are stored in the colData of
the SpatialExperiment object.
library(igraph)
set.seed(220819)
# Spatial community detection - tumor
tumor_spe <- spe[,spe$celltype == "Tumor"]
gr <- graph_from_data_frame(as.data.frame(colPair(tumor_spe, "neighborhood")),
directed = FALSE,
vertices = data.frame(index = seq_len(ncol(tumor_spe))))
cl_comm <- cluster_louvain(gr)
comm_tumor <- paste0("Tumor_", membership(cl_comm))
comm_tumor[membership(cl_comm) %in% which(sizes(cl_comm) < 10)] <- NA
names(comm_tumor) <- colnames(tumor_spe)
# Spatial community detection - non-tumor
stroma_spe <- spe[,spe$celltype != "Tumor"]
gr <- graph_from_data_frame(as.data.frame(colPair(stroma_spe, "neighborhood")),
directed = FALSE,
vertices = data.frame(index = seq_len(ncol(stroma_spe))))
cl_comm <- cluster_louvain(gr)
comm_stroma <- paste0("Stroma_", membership(cl_comm))
comm_stroma[membership(cl_comm) %in% which(sizes(cl_comm) < 10)] <- NA
names(comm_stroma) <- colnames(stroma_spe)
comm <- c(comm_tumor, comm_stroma)
spe$spatial_community <- comm[colnames(spe)]
We can now separately visualize the tumor and stromal communities.
plotSpatial(spe[,spe$celltype == "Tumor"],
node_color_by = "spatial_community",
img_id = "sample_id",
node_size_fix = 0.5) +
theme(legend.position = "none") +
ggtitle("Spatial tumor communities") +
scale_color_manual(values = rev(colors()))
plotSpatial(spe[,spe$celltype != "Tumor"],
node_color_by = "spatial_community",
img_id = "sample_id",
node_size_fix = 0.5) +
theme(legend.position = "none") +
ggtitle("Spatial non-tumor communities") +
scale_color_manual(values = rev(colors()))
The following section highlights the use of the imcRtools package to detect
cellular neighborhoods. This approach has been proposed by (Goltsev et al. 2018) and
(Schürch et al. 2020) to group cells based on information contained in their direct
neighborhood.
(Goltsev et al. 2018) perfomed Delaunay triangulation-based graph construction, neighborhood aggregation and then clustered cells. (Schürch et al. 2020) on the other hand constructed a 10-nearest neighbor graph before aggregating information across neighboring cells.
In the following code chunk we will use the 20-nearest neighbor graph as constructed above to define the direct cellular neighborhood. The aggregateNeighbors function allows neighborhood aggregation in 2 different ways:
Based on these measures, cells can now be clustered into cellular neighborhoods. We will first compute the fraction of the different cell types among the 20-nearest neighbors and use kmeans clustering to group cells into 6 cellular neighborhoods.
# By celltypes
spe <- aggregateNeighbors(spe, colPairName = "knn_interaction_graph",
aggregate_by = "metadata", count_by = "celltype")
set.seed(220705)
cn_1 <- kmeans(spe$aggregatedNeighbors, centers = 6)
spe$cn_celltypes <- as.factor(cn_1$cluster)
plotSpatial(spe,
node_color_by = "cn_celltypes",
img_id = "sample_id",
node_size_fix = 0.5) +
scale_color_brewer(palette = "Set3")
The next code chunk visualizes the cell type compositions of the detected cellular neighborhoods (CN).
library(pheatmap)
for_plot <- prop.table(table(spe$cn_celltypes, spe$celltype), margin = 1)
pheatmap(for_plot,
color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
scale = "column")
CN 1 and CN 5 are mainly composed of tumor cells with CN 5 forming the tumor/stroma border. CN 3 is mainly composed of B and BnT cells indicating TLS. CN 4 is composed of aggregated plasma cells and most T cells.
Downstream of CN assignments, we will analyze the spatial context (SC)
of each cell using three functions from imcRtools.
While CNs can represent sites of unique local processes, the term SC was coined by Bhate and colleagues (Bhate et al. 2022) and describes tissue regions in which distinct CNs may be interacting. Hence, SCs may be interesting regions of specialized biological events.
Here, we will first detect SCs using the
detectSpatialContext
function. This function relies on CN fractions for each cell in a spatial
interaction graph (originally a KNN graph), which we will calculate using
buildSpatialGraph and aggregateNeighbors. We will focus on the CNs derived
from cell type fractions but other CN assignments are possible.
Of note, the window size (k for KNN) for buildSpatialGraph should
reflect a length scale on which biological signals can be exchanged and
depends, among others, on cell density and tissue area. In view of their
divergent functionality, we recommend to use a larger window size for SC
(interaction between local processes) than for CN (local processes)
detection. Since we used a 20-nearest neighbor graph for CN assignment,
we will use a 40-nearest neighbor graph for SC detection.
Subsequently, the CN fractions are sorted from high-to-low and the SC of each cell is assigned as the minimal combination of SCs that additively surpass a user-defined threshold. The default threshold of 0.9 aims to represent the dominant CNs, hence the most prevalent signals, in a given window.
For more details and biological validation, please refer to (Bhate et al. 2022).
library(circlize)
library(RColorBrewer)
# Generate k-nearest neighbor graph for SC detection (k=40)
spe <- buildSpatialGraph(spe, img_id = "sample_id",
type = "knn",
name = "knn_spatialcontext_graph",
k = 40)
# Aggregate based on clustered_neighbors
spe <- aggregateNeighbors(spe,
colPairName = "knn_spatialcontext_graph",
aggregate_by = "metadata",
count_by = "cn_celltypes",
name = "aggregatedNeighborhood")
# Detect spatial contexts
spe <- detectSpatialContext(spe,
entry = "aggregatedNeighborhood",
threshold = 0.90,
name = "spatial_context")
# Define SC color scheme
col_SC <- setNames(colorRampPalette(brewer.pal(9, "Paired"))(length(unique(spe$spatial_context))),
sort(unique(spe$spatial_context)))
We detect a total of 50 distinct SCs across this dataset.
For ease of interpretation, we will directly compare the CN and SC
assignments for Patient3_001.
library(patchwork)
# Compare CN and SC for one patient
p1 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"],
node_color_by = "cn_celltypes",
img_id = "sample_id",
node_size_fix = 0.5,
colPairName = "knn_interaction_graph") +
scale_color_brewer(palette = "Set3")
p2 <- plotSpatial(spe[,spe$sample_id == "Patient3_001"],
node_color_by = "spatial_context",
img_id = "sample_id",
node_size_fix = 0.5,
colPairName = "knn_spatialcontext_graph") +
scale_color_manual(values = col_SC, limits = force)
p1 + p2
As expected, we can observe that interfaces between different CNs make up distinct SCs. For instance, interface between CN 3 (TLS region consisting of B and BnT cells) and CN 4 (Plasma- and T-cell dominated) turns to SC 3_4. On the other hand, the core of CN 3 becomes SC 3, since for the neighborhood for these cells is just the cellular neighborhood itself.
Next, we filter the SCs based on user-defined thresholds for number of group entries (here at least 3 patients) and/or total number of cells (here minimum of 100 cells) per SC with filterSpatialContext.
# By number of group entries and total number of cells
spe <- filterSpatialContext(spe,
entry = "spatial_context",
group_by = "patient_id",
group_threshold = 3,
cells_threshold = 100)
Lastly, we can use the
plotSpatialContext
function to generate SC graphs, analogous to CN combination maps in
(Bhate et al. 2022). Returned objects are ggplot objects, which can be easily
modified further. We will create a SC graph for the filtered SCs here.
# Colored by name and size by n_cells
plotSpatialContext(spe,
entry = "spatial_context_filtered",
group_by = "sample_id",
node_color_by = "name",
node_size_by = "n_cells",
node_label_color_by = "name")
# Colored by n_cells and size by n_group
plotSpatialContext(spe,
entry = "spatial_context_filtered",
group_by = "sample_id",
node_color_by = "n_cells",
node_size_by = "n_group",
node_label_color_by = "n_cells") +
scale_color_viridis()
SC 1 (Tumor-dominated), SC 1_5 (Tumor and Tumor-Stroma interface) and SC 2_4 (Plasma/T cell and Myeloid/Neutrophil interface) are the most frequent SCs in this dataset. Moreover, we may compare the degree of the different nodes in the SC graph. For example, we can observe that SC 1 has only one degree (directed to SC 1_5), while SC 5 (Tumor-Stroma) has a much higher degree (n = 5) and potentially more interaction.
The previous section focused on detecting cellular neighborhoods in a rather
unsupervised fashion. However, the imcRtools package also provides methods for
detecting spatial compartments in a supervised fashion. The
patchDetection
function allows the detection of connected sets of similar cells as proposed by
(Hoch et al. 2022). In the following example, we will use the patchDetection function
to detect tumor patches in three steps:
steinbock graph).spe <- patchDetection(spe,
patch_cells = spe$celltype == "Tumor",
img_id = "sample_id",
expand_by = 1,
min_patch_size = 10,
colPairName = "neighborhood")
plotSpatial(spe,
node_color_by = "patch_id",
img_id = "sample_id",
node_size_fix = 0.5) +
theme(legend.position = "none") +
scale_color_manual(values = rev(colors()))
We can now measure the size of each patch using the patchSize function and visualize tumor patch distribution per patient.
patch_size <- patchSize(spe, "patch_id")
patch_size <- merge(patch_size,
colData(spe)[match(patch_size$patch_id, spe$patch_id),],
by = "patch_id")
ggplot(as.data.frame(patch_size)) +
geom_boxplot(aes(patient_id, log10(size))) +
geom_point(aes(patient_id, log10(size)))
The minDistToCells function can be used to calculate the minimum distance between each cell and a cell set of interest. Here, we highlight its use to calculate the minimum distance of all cells to the detected tumor patches. Negative values indicate the minimum distance of each tumor patch cell to a non-tumor patch cell.
spe <- minDistToCells(spe,
x_cells = !is.na(spe$patch_id),
img_id = "sample_id")
plotSpatial(spe,
node_color_by = "distToCells",
img_id = "sample_id",
node_size_fix = 0.5) +
scale_color_gradient2(low = "dark blue", mid = "white", high = "dark red")
We can now observe th minimum distances to tumor patches in a cell type specific manner.
library(ggridges)
ggplot(as.data.frame(colData(spe))) +
geom_density_ridges(aes(distToCells, celltype, fill = celltype)) +
geom_vline(xintercept = 0, color = "dark red", size = 2) +
scale_fill_manual(values = metadata(spe)$color_vectors$celltype)
The next section focuses on statistically testing the pairwise interaction
between all cell types of the dataset. For this, the imcRtools package
provides the
testInteractions
function which implements the interaction testing strategy proposed by
(Schapiro et al. 2017).
Per grouping level (e.g., image), the testInteractions function computes the
averaged cell type/cell type interaction count and compares this count against
an empirical null distribution which is generated by permuting all cell labels
(while maintaining the tissue structure).
In the following example, we use the steinbock generated spatial interaction
graph and estimate the interaction or avoidance between cell types in the
dataset.
out <- testInteractions(spe,
group_by = "sample_id",
label = "celltype",
colPairName = "neighborhood",
iter = 200)
head(out)
## DataFrame with 6 rows and 10 columns
## group_by from_label to_label ct p_gt p_lt interaction
## <character> <factor> <factor> <numeric> <numeric> <numeric> <logical>
## 1 Patient1_001 Bcell Bcell 1 0.00995025 1.000000 TRUE
## 2 Patient1_001 Bcell BnTcell 0 1.00000000 1.000000 FALSE
## 3 Patient1_001 Bcell CD4 2 0.00497512 1.000000 TRUE
## 4 Patient1_001 Bcell CD8 0 1.00000000 0.855721 FALSE
## 5 Patient1_001 Bcell Myeloid 0 1.00000000 0.686567 FALSE
## 6 Patient1_001 Bcell Neutrophil NA NA NA NA
## p sig sigval
## <numeric> <logical> <numeric>
## 1 0.00995025 TRUE 1
## 2 1.00000000 FALSE 0
## 3 0.00497512 TRUE 1
## 4 0.85572139 FALSE 0
## 5 0.68656716 FALSE 0
## 6 NA NA NA
The returned DataFrame contains the test results per grouping level (in this case
the image ID, group_by), “from” cell type (from_label) and “to” cell type
(to_label). The sigval entry indicates if a pair of cell types is
significantly interacting (sigval = 1), if a pair of cell types is
significantly avoiding (sigval = -1) or if no significant interaction or
avoidance was detected.
These results can be visualized by computing the sum of the sigval entries
across all images:
library(tidyverse)
library(scales)
out %>% as_tibble() %>%
group_by(from_label, to_label) %>%
summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>%
ggplot() +
geom_tile(aes(from_label, to_label, fill = sum_sigval)) +
scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
In the plot above the red tiles indicate cell type pairs that were detected to significantly interact on a large number of images. On the other hand, blue tiles show cell type pairs which tend to avoid each other on a large number of images.
Here we can observe that tumor cells are mostly compartmentalized and are in avoidance with other cell types. As expected, B cells interact with BnT cells; regulatory T cells interact with CD4+ T cells and CD8+ T cells. Most cell types show self interactions indicating spatial clustering.
The imcRtools package further implements an interaction testing strategy
proposed by (Schulz et al. 2018) where the hypothesis is tested if at least n cells of
a certain type are located around a target cell type (from_cell). This type of
testing can be performed by selecting method = "patch" and specifying the
number of patch cells via the patch_size parameter.
out <- testInteractions(spe,
group_by = "sample_id",
label = "celltype",
colPairName = "neighborhood",
method = "patch",
patch_size = 3,
iter = 200)
out %>% as_tibble() %>%
group_by(from_label, to_label) %>%
summarize(sum_sigval = sum(sigval, na.rm = TRUE)) %>%
ggplot() +
geom_tile(aes(from_label, to_label, fill = sum_sigval)) +
scale_fill_gradient2(low = muted("blue"), mid = "white", high = muted("red")) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
These results are comparable to the interaction testing presented above. The main difference comes from the lack of symmetry. We can now for example see that 3 or more myeloid cells sit around CD4\(^+\) T cells while this interaction is not as strong when considering CD4\(^+\) T cells sitting around myeloid cells.
The IMC data analysis book contains a detailed overview on the presented and other approaches for multiplexed image analysis and visualization.
The steinbock framework provides functionality for image processing.
The ImcSegmentationPipeline provides a GUI-based version of the segmentation pipeline based on Ilastik pixel classification and image segmentation via CellProfiler.
The cytomapper package allows handling and visualization of multi-channel images and segmentation masks directly in R.
The imcRtools package supports reading single-cell data from segmented images, multi-channel spillover correction, and spatial data analysis.
The imcdatasets R/Bioconductor package contains a number of publically available IMC datasets.
For a full overview on the presented approaches, please refer to the preprint:
Jonas Windhager developed the steinbock framework. Vito Zanotelli developed
the IMC segmentation pipeline. Daniel Schulz, Tobias Hoch, Lasse Meyer, Jana
Fischer, and Vito Zanotelli provided code for the imcRtools
package.
Nils Eling is funded by Marie Sklodowska Curie Actions.
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] scales_1.2.1 forcats_0.5.2
## [3] stringr_1.4.1 dplyr_1.0.10
## [5] purrr_0.3.4 readr_2.1.2
## [7] tidyr_1.2.1 tibble_3.1.8
## [9] tidyverse_1.3.2 ggridges_0.5.3
## [11] patchwork_1.1.2 RColorBrewer_1.1-3
## [13] circlize_0.4.15 pheatmap_1.0.12
## [15] igraph_1.3.4 viridis_0.6.2
## [17] viridisLite_0.4.1 ggplot2_3.3.6
## [19] imcRtools_1.3.7 SpatialExperiment_1.6.1
## [21] SingleCellExperiment_1.18.0 SummarizedExperiment_1.26.1
## [23] Biobase_2.56.0 GenomicRanges_1.48.0
## [25] GenomeInfoDb_1.32.4 IRanges_2.30.1
## [27] S4Vectors_0.34.0 BiocGenerics_0.42.0
## [29] MatrixGenerics_1.8.1 matrixStats_0.62.0
## [31] BiocStyle_2.24.0
##
## loaded via a namespace (and not attached):
## [1] scattermore_0.8 flowWorkspace_4.8.0
## [3] R.methodsS3_1.8.2 bit64_4.0.5
## [5] knitr_1.40 irlba_2.3.5
## [7] multcomp_1.4-20 DelayedArray_0.22.0
## [9] R.utils_2.12.0 data.table_1.14.2
## [11] RCurl_1.98-1.8 doParallel_1.0.17
## [13] generics_0.1.3 flowCore_2.8.0
## [15] ScaledMatrix_1.4.0 terra_1.6-7
## [17] cowplot_1.1.1 TH.data_1.1-1
## [19] proxy_0.4-27 ggpointdensity_0.1.0
## [21] bit_4.0.4 tzdb_0.3.0
## [23] lubridate_1.8.0 xml2_1.3.3
## [25] httpuv_1.6.6 assertthat_0.2.1
## [27] gargle_1.2.1 fontawesome_0.3.0
## [29] xfun_0.32 hms_1.1.2
## [31] jquerylib_0.1.4 evaluate_0.16
## [33] promises_1.2.0.1 fansi_1.0.3
## [35] readxl_1.4.1 dbplyr_2.2.1
## [37] Rgraphviz_2.40.0 DBI_1.1.3
## [39] CATALYST_1.20.1 htmlwidgets_1.5.4
## [41] googledrive_2.0.0 ellipsis_0.3.2
## [43] ggcyto_1.24.1 ggnewscale_0.4.7
## [45] ggpubr_0.4.0 backports_1.4.1
## [47] V8_4.2.1 cytolib_2.8.0
## [49] bookdown_0.28 svgPanZoom_0.3.4
## [51] RcppParallel_5.1.5 deldir_1.0-6
## [53] sparseMatrixStats_1.8.0 vctrs_0.4.1
## [55] abind_1.4-5 cachem_1.0.6
## [57] withr_2.5.0 ggforce_0.3.4
## [59] aws.signature_0.6.0 cytomapper_1.9.1
## [61] vroom_1.5.7 svglite_2.1.0
## [63] cluster_2.1.4 crayon_1.5.1
## [65] drc_3.0-1 labeling_0.4.2
## [67] units_0.8-0 edgeR_3.38.4
## [69] pkgconfig_2.0.3 tweenr_2.0.2
## [71] vipor_0.4.5 rlang_1.0.5
## [73] lifecycle_1.0.2 sandwich_3.0-2
## [75] modelr_0.1.9 rsvd_1.0.5
## [77] cellranger_1.1.0 polyclip_1.10-0
## [79] graph_1.74.0 tiff_0.1-11
## [81] Matrix_1.4-1 raster_3.5-29
## [83] carData_3.0-5 Rhdf5lib_1.18.2
## [85] zoo_1.8-10 reprex_2.0.2
## [87] base64enc_0.1-3 beeswarm_0.4.0
## [89] RTriangle_1.6-0.10 googlesheets4_1.0.1
## [91] GlobalOptions_0.1.2 png_0.1-7
## [93] rjson_0.2.21 shinydashboard_0.7.2
## [95] bitops_1.0-7 R.oo_1.25.0
## [97] KernSmooth_2.23-20 ConsensusClusterPlus_1.60.0
## [99] rhdf5filters_1.8.0 DelayedMatrixStats_1.18.0
## [101] classInt_0.4-7 shape_1.4.6
## [103] jpeg_0.1-9 rstatix_0.7.0
## [105] ggsignif_0.6.3 aws.s3_0.3.21
## [107] beachmat_2.12.0 magrittr_2.0.3
## [109] plyr_1.8.7 hexbin_1.28.2
## [111] zlibbioc_1.42.0 compiler_4.2.1
## [113] dqrng_0.3.0 concaveman_1.1.0
## [115] plotrix_3.8-2 clue_0.3-61
## [117] cli_3.4.0 XVector_0.36.0
## [119] ncdfFlow_2.42.1 FlowSOM_2.4.0
## [121] MASS_7.3-58.1 tidyselect_1.1.2
## [123] stringi_1.7.8 RProtoBufLib_2.8.0
## [125] highr_0.9 yaml_2.3.5
## [127] BiocSingular_1.12.0 locfit_1.5-9.6
## [129] latticeExtra_0.6-30 ggrepel_0.9.1
## [131] grid_4.2.1 sass_0.4.2
## [133] EBImage_4.38.0 tools_4.2.1
## [135] parallel_4.2.1 CytoML_2.8.1
## [137] rstudioapi_0.14 foreach_1.5.2
## [139] gridExtra_2.3 farver_2.1.1
## [141] Rtsne_0.16 ggraph_2.0.6
## [143] DropletUtils_1.16.0 digest_0.6.29
## [145] BiocManager_1.30.18 shiny_1.7.2
## [147] Rcpp_1.0.9 car_3.1-0
## [149] broom_1.0.1 scuttle_1.6.3
## [151] later_1.3.0 httr_1.4.4
## [153] sf_1.0-8 ComplexHeatmap_2.12.1
## [155] distances_0.1.8 colorspace_2.0-3
## [157] rvest_1.0.3 fs_1.5.2
## [159] XML_3.99-0.10 splines_4.2.1
## [161] RBGL_1.72.0 scater_1.24.0
## [163] graphlayouts_0.8.1 sp_1.5-0
## [165] systemfonts_1.0.4 xtable_1.8-4
## [167] jsonlite_1.8.0 tidygraph_1.2.2
## [169] R6_2.5.1 pillar_1.8.1
## [171] htmltools_0.5.3 mime_0.12
## [173] nnls_1.4 glue_1.6.2
## [175] fastmap_1.1.0 DT_0.24
## [177] BiocParallel_1.30.3 BiocNeighbors_1.14.0
## [179] fftwtools_0.9-11 class_7.3-20
## [181] codetools_0.2-18 mvtnorm_1.1-3
## [183] utf8_1.2.2 lattice_0.20-45
## [185] bslib_0.4.0 curl_4.3.2
## [187] ggbeeswarm_0.6.0 colorRamps_2.3.1
## [189] gtools_3.9.3 magick_2.7.3
## [191] interp_1.1-3 survival_3.4-0
## [193] limma_3.52.2 rmarkdown_2.16
## [195] munsell_0.5.0 e1071_1.7-11
## [197] GetoptLong_1.0.5 rhdf5_2.40.0
## [199] GenomeInfoDbData_1.2.8 iterators_1.0.14
## [201] HDF5Array_1.24.2 haven_2.5.1
## [203] reshape2_1.4.4 gtable_0.3.1